The Pennsylvania State University, Spring 2021 Stat 415-001, Hyebin Song

Point Estimation


Contents

  1. Introduction to Point Estimation: Learning objectives; Recap; The bias and mean squared error of point estimators
  2. Method of Moments Estimation: Learning objectives; Procedures to obtain Method of Moments (MoM) estimators
  3. Maximum Likelihood Estimation: Learning objectives; Procedures to obtain Maximum Likelihood Estimators (MLE)
  4. Properties of Point Estimators: Learning objectives; Finite sample properties; Large sample (asymptotic) properties; Properties of MoM estimators; Properties of MLEs
  5. Sufficient Statistics and Rao-Blackwellization: Learning objectives; Sufficient statistics and the factorization theorem; Rao-Blackwellization

Introduction to Point Estimation

Learning objectives

  1. Understand the goal of point estimation
  2. Understand the bias and mean squared error of point estimators

 

Recap

Setting: $X_1, \ldots, X_n$ i.i.d. from a distribution with pdf or pmf $f(x; \theta)$, where $\theta$ is an unknown value in the parameter space $\Omega$.

 

Recall the definitions:

point estimation

  1. A point estimator of $\theta$: a statistic $\hat{\theta} = u(X_1, \ldots, X_n)$, that is, a function of the sample, used to estimate $\theta$.
  2. A point estimate of $\theta$: the value $u(x_1, \ldots, x_n)$ of the estimator computed from an observed sample $x_1, \ldots, x_n$.

 

Notation:

Recall that for a continuous random variable $X$ with a pdf $f(x; \theta)$, the probability of an event $A$, the expectation of $u(X)$, and the variance of $X$ are computed as
$$P(X \in A) = \int_A f(x; \theta)\,dx, \qquad E[u(X)] = \int_{-\infty}^{\infty} u(x) f(x; \theta)\,dx, \qquad \mathrm{Var}(X) = E\big[(X - E[X])^2\big].$$

Note: for a discrete random variable $X$, we can replace the pdf with a pmf $f(x; \theta)$ and the integrals with summations.

We sometimes use subscripts and write $E_\theta$ and $\mathrm{Var}_\theta$ to emphasize that the expectation and variance are computed using the pdf $f(x; \theta)$ with a particular value of $\theta$.
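For concreteness, the short sketch below evaluates these quantities by numerical integration. The exponential pdf $f(x; \theta) = \theta e^{-\theta x}$ and the specific event are assumptions chosen purely for illustration; they are not part of the notes.

```python
import numpy as np
from scipy import integrate

# Assumed example: X ~ Exponential with rate theta, so f(x; theta) = theta * exp(-theta * x), x >= 0
theta = 2.0
pdf = lambda x: theta * np.exp(-theta * x)

# P(1 <= X <= 2): integrate the pdf over the event A = [1, 2]
prob, _ = integrate.quad(pdf, 1.0, 2.0)

# E[X] and Var(X) = E[(X - E[X])^2], also by numerical integration
mean, _ = integrate.quad(lambda x: x * pdf(x), 0.0, np.inf)
var, _ = integrate.quad(lambda x: (x - mean) ** 2 * pdf(x), 0.0, np.inf)

print(prob, mean, var)   # mean should be close to 1/theta = 0.5, var close to 1/theta^2 = 0.25
```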

 

The bias and mean squared error of point estimators

Definition (Biased and unbiased estimators)

An estimator $\hat{\theta} = u(X_1, \ldots, X_n)$ of $\theta$ is called unbiased if $E_\theta[\hat{\theta}] = \theta$ for every $\theta \in \Omega$; otherwise it is called biased. The bias of $\hat{\theta}$ is $\mathrm{Bias}(\hat{\theta}) = E_\theta[\hat{\theta}] - \theta$.

 

Example Suppose we have , i.i.d. where the parameter space . We consider the following three estimators

Are these three estimators unbiased?

For any ,

  1. we have . Therefore is an unbiased estimator of .
  2. . Therefore is an unbiased estimator of .
  3. Therefore is not an unbiased estimator of .

 

Mean Squared Error (MSE)

Definition (Mean squared error) The mean squared error of an estimator $\hat{\theta}$ of $\theta$ is
$$\mathrm{MSE}(\hat{\theta}) = E_\theta\big[(\hat{\theta} - \theta)^2\big] = \mathrm{Var}_\theta(\hat{\theta}) + \big[\mathrm{Bias}(\hat{\theta})\big]^2.$$

Example (cont'd) Compute MSE of .
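Because the specific estimators in this example are not reproduced here, the sketch below only illustrates how bias and MSE can be approximated by Monte Carlo simulation. The Bernoulli($p$) model and the two estimators compared are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, reps = 0.3, 20, 100_000

# Draw `reps` samples of size n from an assumed Bernoulli(p) model
samples = rng.binomial(1, p, size=(reps, n))

# Two illustrative estimators of p: the sample mean, and the first observation alone
est_mean = samples.mean(axis=1)
est_first = samples[:, 0]

for name, est in [("sample mean", est_mean), ("first observation", est_first)]:
    bias = est.mean() - p               # approximates E[theta_hat] - p
    mse = np.mean((est - p) ** 2)       # approximates E[(theta_hat - p)^2]
    print(f"{name}: bias ≈ {bias:.4f}, MSE ≈ {mse:.4f}")
```

Both estimators are unbiased here, but the sample mean has a much smaller MSE, which is the distinction the MSE is designed to capture.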

 

Method of Moments Estimation

Learning objectives

  1. Understand how to compute method of moments estimators

 

Setting: $X_1, \ldots, X_n$ i.i.d. from a distribution with pdf or pmf $f(x; \theta)$, where $\theta$ is an unknown value in the parameter space $\Omega$.

 

Idea: Substitution principle: estimate unknown population moments by the corresponding sample moments.

 

Example. $X_1, \ldots, X_n$ i.i.d. We want to estimate .

. .

 

Procedures to obtain Method of Moments (MoM) estimators of $\theta_1, \ldots, \theta_k$

  1. Write the first $k$ population moments $E[X], E[X^2], \ldots, E[X^k]$ as functions of $\theta_1, \ldots, \theta_k$.

  2. Solve (1) with respect to $\theta_1, \ldots, \theta_k$. That is, find functions $g_1, \ldots, g_k$ such that $\theta_j = g_j(E[X], \ldots, E[X^k])$ for $j = 1, \ldots, k$.

  3. Substitute the population moments with the sample moments $m_r = \frac{1}{n}\sum_{i=1}^{n} X_i^r$, $r = 1, \ldots, k$, giving the MoM estimators $\hat{\theta}_j = g_j(m_1, \ldots, m_k)$.

 

Example $X_1, \ldots, X_n$ i.i.d. We want to estimate .

Write as functions of .

  1. .

  2. Solve (1) with respect to .

    1. .
  3. Substitute population moments with sample moments

Therefore,

Remark: The Method of Moments estimator for is unbiased, but the Method of Moments estimator for is biased.
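The remark above matches the common case of estimating a mean and a variance by the method of moments. The sketch below assumes a normal model purely for illustration; it computes the MoM estimates and checks by simulation that the sample mean is unbiased while the MoM variance estimator, which divides by $n$, is biased downward.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma2, n, reps = 5.0, 4.0, 10, 100_000

# Assumed model for illustration: X_i ~ N(mu, sigma2)
x = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))

# MoM: E[X] = mu and E[X^2] = sigma2 + mu^2, so
#   mu_hat     = m1
#   sigma2_hat = m2 - m1^2 = (1/n) * sum((X_i - X_bar)^2)
m1 = x.mean(axis=1)
m2 = (x ** 2).mean(axis=1)
mu_hat = m1
sigma2_hat = m2 - m1 ** 2

print("E[mu_hat]     ≈", mu_hat.mean())       # close to mu: unbiased
print("E[sigma2_hat] ≈", sigma2_hat.mean())   # close to (n-1)/n * sigma2: biased
```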

 

Maximum Likelihood Estimation

Learning objectives

  1. Understand how to compute maximum likelihood estimators (MLE)

  2. Understand the invariance property of the MLE

     

Setting: $X_1, \ldots, X_n$ i.i.d. from a distribution with pdf or pmf $f(x; \theta)$, where $\theta$ is an unknown value in the parameter space $\Omega$.

 

Idea: Choose the value of $\theta$ that is most likely to have given rise to the observed data.

 

Example: Suppose . We have the observed sample . Suppose the parameter space (so the parameter space contains only 2 values).

Given each , what is the probability of observing ?

  1. when ,
  2. when ,

Thus, since , is the maximum likelihood estimate of .
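Since the specific numbers in this example are not reproduced here, the sketch below only illustrates the same idea with assumed values: an observed Bernoulli sample and a two-point parameter space.

```python
import numpy as np

# Hypothetical observed sample and hypothetical two-point parameter space
x = np.array([1, 0, 1, 1, 0])
candidates = [0.2, 0.6]

def likelihood(p, x):
    """Probability of observing the sample x when the success probability is p."""
    return np.prod(p ** x * (1 - p) ** (1 - x))

for p in candidates:
    print(f"p = {p}: probability of the observed sample = {likelihood(p, x):.5f}")

# The maximum likelihood estimate over this parameter space is the candidate
# with the larger probability of having produced the observed sample.
mle = max(candidates, key=lambda p: likelihood(p, x))
print("MLE:", mle)
```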

 

Remark:

  1. For a fixed observed sample $x_1, \ldots, x_n$, the joint pdf or pmf $f(x_1; \theta) \cdots f(x_n; \theta)$ is a function of $\theta$.
  2. Finding a maximum likelihood estimate can be viewed as finding a maximizer of this function for $\theta \in \Omega$.
  3. We call this function the likelihood function.

 

Definition (Likelihood function)

Given an observed sample $x_1, \ldots, x_n$, the likelihood function is $L(\theta) = \prod_{i=1}^{n} f(x_i; \theta)$ for $\theta \in \Omega$, and the log-likelihood function is $\ell(\theta) = \log L(\theta)$.

Definition (Maximum Likelihood Estimator)

A maximum likelihood estimate is a value $\hat{\theta} = \hat{\theta}(x_1, \ldots, x_n) \in \Omega$ that maximizes the likelihood function $L(\theta)$ (equivalently, the log-likelihood $\ell(\theta)$). The corresponding statistic $\hat{\theta}(X_1, \ldots, X_n)$ is called the maximum likelihood estimator (MLE).

Example. Likelihood and log-likelihood function of based on the observed sample , when i.i.d., .

.

Therefore,

 

Procedures to obtain Maximum Likelihood Estimators (MLE) of $\theta$

 

  1. Compute the log-likelihood function.

    1. Find the pdf or pmf of $X_i$.
    2. Find the log-likelihood function $\ell(\theta) = \sum_{i=1}^{n} \log f(x_i; \theta)$.
  2. Find a maximizer of the log-likelihood function.

    1. Compute stationary points of the log-likelihood.

      • When the likelihood is differentiable (most cases), find the solution of the equation $\ell'(\theta) = 0$ inside the parameter space.
    2. Find a maximizer among candidate points (stationary points and boundary points).

     

  3. For a given sample $x_1, \ldots, x_n$, the maximum likelihood estimate is $\hat{\theta}(x_1, \ldots, x_n)$. The maximum likelihood estimator is $\hat{\theta}(X_1, \ldots, X_n)$, which is a random variable.
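As a complement to the analytic procedure above, a log-likelihood can also be maximized numerically. The sketch below assumes an exponential model with rate $\theta$ (an illustrative choice, not an example from the notes), where the closed-form MLE $\hat{\theta} = 1/\bar{x}$ is available to check the numerical answer.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
x = rng.exponential(scale=1 / 1.5, size=50)   # assumed data: Exponential with rate theta = 1.5

def neg_log_likelihood(theta, x):
    # Exponential pdf: f(x; theta) = theta * exp(-theta * x) for x >= 0,
    # so the log-likelihood is n * log(theta) - theta * sum(x_i).
    return -(len(x) * np.log(theta) - theta * np.sum(x))

res = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 100.0), args=(x,), method="bounded")
print("numerical MLE:     ", res.x)
print("closed form 1/xbar:", 1 / x.mean())   # should agree closely
```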

 

Example (Bernoulli MLE) Compute the MLE of $p$ for a sample $x_1, \ldots, x_n$, when $X_1, \ldots, X_n$ i.i.d. Bernoulli($p$).

 

  2. a. Find stationary points in the interior of the parameter space.

Solving $\ell'(p) = 0$ for $p$, we get the solution $p = \bar{x}$.

When $0 < \bar{x} < 1$, $\bar{x}$ is the unique stationary point in the interior.

When $\bar{x} = 0$ or $\bar{x} = 1$, there exist no stationary points in the interior.

 

b. When $0 < \bar{x} < 1$, $\bar{x}$ is the global maximizer, since

$\ell'(p) > 0$ for $p < \bar{x}$, and

$\ell'(p) < 0$ for $p > \bar{x}$.

When $\bar{x} = 0$ or $\bar{x} = 1$,

the likelihood reduces to $(1 - p)^n$ or $p^n$, respectively, and it is straightforward to verify that both functions are monotone and the maximum is achieved at $p = 0$ and $p = 1$, respectively. Then again, $\bar{x}$ is the global maximizer.

Therefore, the maximum likelihood estimate of $p$ is $\hat{p} = \bar{x}$.

 

  3. The maximum likelihood estimator of $p$ is $\hat{p} = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$.

 

Example (Normal MLE, both $\mu$ and $\sigma^2$ unknown) Compute the MLE of $(\mu, \sigma^2)$ for a sample $x_1, \ldots, x_n$, when $X_1, \ldots, X_n$ i.i.d. $N(\mu, \sigma^2)$.

We have

  1. a. pdf of $X_i$: $f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$

    b. Log-likelihood: $\ell(\mu, \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$

  2. a. Find stationary points in the parameter space.

    Setting the partial derivatives of $\ell$ with respect to $\mu$ and $\sigma^2$ equal to zero, we get $\hat{\mu} = \bar{x}$, $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$.

 

b. We can verify that the solution in (a) is indeed the unique global maximizer by using a second-derivative condition (the Hessian matrix has a positive determinant and a negative first diagonal element, so it is negative definite) and by checking that there is no maximum at infinity.

  3. The maximum likelihood estimators are $\hat{\mu} = \bar{X}$ and $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$.
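A quick numerical check of these closed forms (a sketch; the simulated data below are an assumption made for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(loc=2.0, scale=3.0, size=200)   # assumed data from N(2, 9)

mu_hat = x.mean()                               # MLE of mu: the sample mean
sigma2_hat = np.mean((x - mu_hat) ** 2)         # MLE of sigma^2: divides by n, not n - 1

print(mu_hat, sigma2_hat)
print(np.var(x, ddof=0))                        # identical to sigma2_hat
```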

 

Example (Uniform MLE) Compute the MLE of $\theta$ for a sample $x_1, \ldots, x_n$, when $X_1, \ldots, X_n$ i.i.d. Uniform$(0, \theta)$.

  1. a. pdf of $X_i$: $f(x; \theta) = \frac{1}{\theta}$ for $0 \le x \le \theta$ (and $0$ otherwise).

    Joint pdf of $X_1, \ldots, X_n$: $\frac{1}{\theta^n}$ if $0 \le x_i \le \theta$ for all $i$, and $0$ otherwise.

    b. Likelihood: $L(\theta) = \frac{1}{\theta^n}$ for $\theta \ge \max_i x_i$ (and $0$ otherwise).

    Log-likelihood: $\ell(\theta) = -n \log \theta$ for $\theta \ge \max_i x_i$.

  2. a. Find stationary points in $[\max_i x_i, \infty)$.

    For $\theta \ge \max_i x_i$,

    $\ell'(\theta) = -n/\theta < 0$. No stationary points exist. The log-likelihood is a decreasing function of $\theta$.

    Therefore, the log-likelihood function is maximized at $\theta = \max_i x_i$.

    The maximum likelihood estimator is $\hat{\theta} = \max_i X_i = X_{(n)}$.
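A small simulation sketch (the value of $\theta$ below is an assumption for illustration) showing that the sample maximum is the MLE and that it tends to sit slightly below $\theta$:

```python
import numpy as np

rng = np.random.default_rng(4)
theta, n, reps = 10.0, 20, 100_000

x = rng.uniform(0, theta, size=(reps, n))
theta_hat = x.max(axis=1)                  # MLE: the sample maximum

print("E[theta_hat] ≈", theta_hat.mean())  # close to n/(n+1) * theta, so the MLE is biased downward
```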

 

Theorem (Invariance) If $\hat{\theta}$ is the MLE of $\theta$, then for any function $g$, $g(\hat{\theta})$ is the MLE of $g(\theta)$.

Remark Theorem 6.4-1 in HTZ requires that $g$ is one-to-one. When $g$ is not one-to-one, the discussion becomes more subtle, but we will not worry about this point in this class.

Proof (when $g$ is one-to-one)

Let $\eta = g(\theta)$, and let $L^*(\eta) = L(g^{-1}(\eta))$ denote the likelihood written as a function of $\eta$. We have that $\hat{\theta}$ is the maximizer of $L(\theta)$.

This function has the largest value $L(\hat{\theta})$, attained when $g^{-1}(\eta) = \hat{\theta}$, that is, when $\eta = g(\hat{\theta})$.

Therefore, $g(\hat{\theta})$ is the maximizer of the function $L^*(\eta)$, so $g(\hat{\theta})$ is the MLE of $g(\theta)$.

 

Example (Bernoulli MLE) Compute the MLE of for a sample $x_1, \ldots, x_n$, when $X_1, \ldots, X_n$ i.i.d. Bernoulli($p$).

The parameter of interest .

Since the MLE of $p$ is $\hat{p} = \bar{x}$, the MLE of the parameter of interest is obtained, by the invariance property, by evaluating it at $\hat{p} = \bar{x}$.
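Because the target parameter in this example is not reproduced here, the sketch below illustrates invariance with an assumed target, the Bernoulli variance $g(p) = p(1 - p)$: the MLE of $g(p)$ is simply $g(\hat{p})$.

```python
import numpy as np

x = np.array([1, 0, 1, 1, 0, 1, 0, 1])   # hypothetical Bernoulli sample

p_hat = x.mean()                          # MLE of p
g_hat = p_hat * (1 - p_hat)               # by invariance, the MLE of g(p) = p(1 - p)

print(p_hat, g_hat)
```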

 

Properties of Point Estimators

Learning objectives

  1. Understand various finite sample properties (unbiasedness, sufficiency) and large sample (asymptotic) properties (consistency, asymptotic normality/efficiency) of point estimators
  2. Understand optimal properties of maximum likelihood estimators

 

Finite sample properties

 

 

Large sample (asymptotic) properties

 

Properties of MoM estimators

 

Properties of MLEs

 

Sufficient Statistics and Rao-Blackwellization

Learning Objectives

Sufficient statistics and the factorization theorem

 

 

 

Rao-Blackwellization

Previously, we mentioned that a "good" estimator should be based on a sufficient statistic because a sufficient statistic contains all information about a parameter. In other words, given a sufficient statistic, adding another statistic (i.e., a function of $X_1, \ldots, X_n$) only introduces noise, because there is no information about $\theta$ left in the conditional distribution of the data given the sufficient statistic.

In fact, if we have an unbiased estimator that is not a function of a sufficient statistic, we can improve it by conditioning on the sufficient statistic.

 

Theorem (Rao-Blackwell Theorem) Let $X_1, \ldots, X_n$ be a random sample from a distribution with pdf or pmf $f(x; \theta)$. Let $\hat{\theta}$ be an unbiased estimator of $\theta$. Let $T$ be a sufficient statistic for $\theta$, and define $\hat{\theta}^* = E[\hat{\theta} \mid T]$. Then,

  1. $\hat{\theta}^*$ is a function of the sufficient statistic $T$, where the function does not depend on $\theta$. In particular, $\hat{\theta}^*$ is a statistic.
  2. $E_\theta[\hat{\theta}^*] = \theta$, that is, $\hat{\theta}^*$ is also unbiased.
  3. $\mathrm{Var}_\theta(\hat{\theta}^*) \le \mathrm{Var}_\theta(\hat{\theta})$, where the inequality is strict if the original estimator was not a function of $T$ alone.

 

Remark 1 The new estimator $\hat{\theta}^* = E[\hat{\theta} \mid T]$, which is based on the sufficient statistic $T$, is a better estimator than the original estimator $\hat{\theta}$ if the original estimator was not a function of $T$ alone.

Remark 2 In many cases, by improving an estimator based on the Rao-Blackwell theorem (called Rao-Blackwellization), we not only get an estimator with a smaller variance but we actually obtain a minimum-variance unbiased estimator (MVUE). This is especially true when we use a sufficient statistic from the previous theorem (Sufficient Statistic when a pmf/pdf is of the exponential form).